#--------------------------------------------------------------------------------#
# Authors: 	Diana Da In Lee, Yamil R. Velez
# Title: 	Measuring Descriptive Representation at Scale: Methods for Predicting the Race and Ethnicity of Public Officials
# Date: 	2024-07-01
# Copyright (c) 2021, under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
#  For more information see: http://creativecommons.org/licenses/by-nc-sa/3.0/us/
#  All rights reserved. 
#--------------------------------------------------------------------------------#


#--------------------------------------------------------------------------------#
# codebook
#--------------------------------------------------------------------------------#

The main dataset used for the analyses is 'img5_weighted_opencv_cov2f_nofw_level.rds' located in Data/Prediction.

- place_fips: Census state and place FIPS code
- office: candidate electoral position
- office_consolidated: cleaned version of `office`
- district: district number in which a candidate ran
- year: election year
- state: state name
- city: city in which a candidate ran
- first: candidate first name
- surname: candidate last name
- full_name: candidate full name
- votes: total votes a candidate earned
- pid_final: predicted party ID
- pop_2010: 2010 population size
- gender: candidate gender
- n_winners: number of candidates who won in a given election
- winner: whether or not a candidate won the election
- votes_total: total votes in a given election
- voteshare: vote share for a candidate
- incumbent: indicator for incumbent candidate
- s.whi - s.oth: BSO predictions (asi = asian, bla = black, his = hispanic, whi = white, oth = other)
- gs.whi - gs.oth: BISG predictions
- fbisg.whi - fbisg.oth: fBISG predictions
- lstm.whi - lstm.oth: LSTM predictions (based on Florida voter file as a training set)
- lstm_wk.whi - lstm_wk.oth: LSTM predictions (based on Wikipedia as a training set)
- lstm_nc.whi - lstm_nc.oth: LSTM predictions (based on NC voter file as a training set)
- ff.bla - ff.asi: FairFace predictions
- vgg.bla - vgg.asi: VGG predictions
- hybrid.bla - hybrid.oth: hybrid method predictions
- race.s - race.hybrid: final predicted race for each method, i.e., the category with the highest predicted probability
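
The highest-probability rule behind the race.* columns can be sketched as follows. This is a minimal illustration in Python, not the authors' code: the helper `final_prediction`, the example row, and its probability values are hypothetical; the `asi` abbreviation is taken from the `ff.asi` column name; and tie handling (first category wins) is an assumption. The actual coding of the race.* values in the dataset may differ from the short codes returned here.

```python
# Race abbreviations used as column suffixes (asi assumed from ff.asi).
RACES = ["whi", "bla", "his", "asi", "oth"]


def final_prediction(row, prefix, races=RACES):
    """Return the race code whose `<prefix>.<race>` probability is largest.

    Columns missing from `row` are skipped, since some methods may not
    report every category. Ties resolve to the first category in `races`.
    """
    probs = {
        race: row[f"{prefix}.{race}"]
        for race in races
        if f"{prefix}.{race}" in row
    }
    if not probs:
        raise KeyError(f"no columns found for prefix {prefix!r}")
    return max(probs, key=probs.get)


# Hypothetical row with made-up surname-only probabilities.
row = {"s.whi": 0.10, "s.bla": 0.05, "s.his": 0.70, "s.asi": 0.10, "s.oth": 0.05}
print(final_prediction(row, "s"))  # his
```

Applying the same helper with a different prefix (for example `gs`, `lstm`, or `hybrid`) would give the corresponding per-method prediction under the same assumptions.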
